ABSTRACT

Reporting the results of statewide assessment looms as a problem as
more states pass from the planning to the implementation phase in
their assessment programs. When energies are focused on the purpose
of the assessment, formulating objectives, and instrument
construction, reporting takes a back seat because it happens last.
There are some general principles to be followed in order to report
effectively the results of a large scale assessment program. This
paper begins with several recent references on how to report the
results of large scale assessment programs. The remainder of this
paper is intended to provide specific new thoughts for implementation
of old principles. The ultimate success of state assessment programs
will depend on how well assessment results are reported to their
various audiences. In this paper, the most compelling recommendations
for improving reporting practices are plan ahead, develop different
reports for different audiences, and field test report formats to
determine the language and content that are most meaningful to
respective audiences. Reporting should receive the same careful
attention as instrument construction with sufficient opportunity for
feedback from intended users. (Author/DEP)






Documents acquired by ERIC include many informal unpublished
materials not available from other sources. ERIC makes every effort
to obtain the best copy available; nevertheless, items of marginal
reproducibility are often encountered and this affects the quality
of the microfiche and hardcopy reproductions ERIC makes available
via the ERIC Document Reproduction Service (EDRS). EDRS is not
responsible for the quality of the original document. Reproductions
supplied by EDRS are the best that can be made from the original.



REPORTING THE RESULTS OF STATEWIDE ASSESSMENT

Lorrie Shepard
University of Colorado

Reporting the results of statewide assessment looms as a bigger problem as
more states pass from the planning to the implementation phase in their assessment
programs. When energies were focused on the purpose of the assessment, formulating
objectives, and instrument construction, reporting took a back seat because it
would happen last. That was, of course, our first mistake, which is one of the
points made in this paper. Reporting is a bigger problem now because (1) now we
actually have to do it, (2) we haven't given reporting the same kind of attention
we've given to problems in test development or sampling, and (3) all the errors
in other aspects of the assessment program accumulate in the reports.

There are some general principles to be followed in order to report
effectively the results of a large scale assessment program. These principles are
not new. For example, Bob Stake suggested some time ago that evaluation reports
should be tailored for specific audiences. If state agencies and contractors
are having trouble writing reports that will be read, it is not because they
haven't heard the rules often enough, but because there is much that can go wrong
between the theory of what makes a good report and putting it into practice.
This paper begins with several recent references on how to report the results of
large scale assessment programs. The remainder of the paper is intended to
provide specific new thoughts for implementation of old principles.

The single most important reference on assessment is Frank Womer's monograph,
Developing a Large Scale Assessment Program. It is commendable not just because
it provides useful guidelines, but also because the author suggests specific
reporting strategies within the context of a total assessment plan. The
Cooperative Accountability Project also sponsored a three-part document, A
Dissemination System for State Accountability Programs, written by professors of
communication, Bettinghaus and Miller. One of its most valuable elements is a
set of recommendations for dealing with the news media. A final reference is Ed
Larsen's Suggestions for Talking to School-Community Groups about Testing and Test
Results. Additional references should include the proceedings of numerous
conferences for state assessment personnel where reporting problems and solutions
have been major topics, such as the ETS Conference for directors of state testing
programs, the National Assessment workshops, and a meeting of eight states in
Florida sponsored by USOE. Unfortunately, this wisdom has not been published.
The attempt in this paper to build upon the knowledge already available is
therefore limited by the author's attendance at some but not all of the meetings.

1. Plan Ahead: Reporting should receive as much attention as test construction.

The cardinal rule for good reporting is to decide what should be reported
before planning the assessment. It should be the first step, not the last. Only
the minor cosmetic aspects of reporting can safely be postponed.

2. Different Reports for Different Audiences.

A second overriding principle, whose elaboration will provide for
implementation of the first, is that reports should be tailored for specific audiences.
Different audiences need different information. This has implications for both
the content and format of assessment reports.

Choosing the appropriate content is clearly the more important consideration.
How to properly display irrelevant data is a foolish question. The content of a
report not only determines whether it will be useful to its audience; the intended
content of reports will also dictate the design of the assessment, instrumentation,
data collection, and analysis. Further discussion of assessment content has been
omitted, however, because it receives full attention in the accompanying paper by
Jim Impara.



In this paper, the more mundane questions of format are addressed. What is
the proper length, wording, organization, and medium that will suit each audience?
These may in fact be the variables that determine whether the well-chosen
information is received.

The best technique for identifying audiences is that of sample reports,
recommended by Frank Womer and earlier by Bob Stake. Sample reports can be
simple sentences that exemplify the choices, such as whether the results are to
be reported for a state, district, school, or individual, and whether the criterion
will be the percent passing a pre-specified number of items or a comparison to some
norm. Frank Womer offers ten of these examples as part of his discussion about
how to determine the purpose of a large scale assessment. These decisions must
be made at the outset, and according to Womer, the "best way to 'force' those
persons who establish policy and purpose is to give them a series of possible
types of reports and have them decide which ones provide the type of information
they really want" (p. 19).

Once audiences have been identified as well as the information appropriate
to each, then the selected sample reports become the basis for further
elaboration of specific reporting strategies.

Planning sessions could begin by making two lists, one of the possible
audiences of the assessment and the other of types of information available.
Obviously, the more fully implemented the assessment program is by the time the
planning meeting takes place, the more constraints there will be on the second
list. A matrix can then be constructed with potential audiences along one
dimension, such as legislators, classroom teachers, parents, superintendents,
reading specialists, and educational researchers; along the other dimension
would be examples of assessment results, such as a pupil's scores on reading
objectives, statewide averages in comparison to national norms, or district level
subtest scores in relation to pre-specified criteria. The matrix is useful
initially to make certain that none of the important types of information or
audiences are left out. Real progress will be made in planning, however, when
the matrix begins to collapse; once it is possible to identify similarities in
the needs of certain audiences, then it is possible to specify a manageable
number of report types with each addressed to specific information requirements.
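
The collapsing of the matrix can be illustrated with a short sketch. The
audiences and information types come from the examples above, but the
True/False entries are invented for illustration, and the function names
are hypothetical. Audiences whose rows are identical are grouped, and each
group yields one report type.

```python
from collections import defaultdict

# Information types along one dimension of the matrix (from the text).
INFO_TYPES = [
    "pupil scores on reading objectives",
    "statewide averages vs. national norms",
    "district subtest scores vs. criteria",
]

# Audiences along the other dimension. The True/False needs are
# hypothetical and would come out of the planning session.
NEEDS = {
    "legislators":             (False, True,  False),
    "parents":                 (True,  False, False),
    "classroom teachers":      (True,  False, True),
    "reading specialists":     (True,  False, True),
    "superintendents":         (False, True,  True),
    "educational researchers": (False, True,  True),
}


def collapse(needs):
    """Group audiences whose rows in the matrix are identical."""
    groups = defaultdict(list)
    for audience, row in needs.items():
        groups[row].append(audience)
    return groups


def report_types(needs, info_types):
    """Return one (audiences, contents) pair per distinct row."""
    reports = []
    for row, audiences in collapse(needs).items():
        contents = [name for name, wanted in zip(info_types, row) if wanted]
        reports.append((sorted(audiences), contents))
    return reports


if __name__ == "__main__":
    for audiences, contents in report_types(NEEDS, INFO_TYPES):
        print(", ".join(audiences), "->", "; ".join(contents))
```

In this invented case six audiences reduce to four report types. Real needs
will rarely match exactly, so a planning group would probably merge rows
that are merely similar rather than identical.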

Planning for reporting is complicated somewhat by the interaction between
report content and a third dimension, report format. Clearly content choices
should govern the selection of report format. But it is only possible to
present certain kinds of information in a one-minute television segment. There
will be some audience characteristics, such as technical understanding, political
perspective, or attention span, that will make some modes of presentation
unacceptable. These limitations on format may in turn cause reconsideration of the
kinds of information that can reasonably be included in each basic report type.

3. All Reports Are ...

When speaking of state assessment reports, one usually means written reports.
This habit occurs because written reports are proportionally the biggest share of
state department reports, certainly if measured by weight. Many of the
suggestions offered in this paper are most appropriate for written documents, but this
should not cause us to overlook the virtues of other media. Slide shows or
filmstrip presentations are visually alluring and frequently hold an audience's
attention longer than the same words and graphs would in a written document.
There are costs involved, of course, and a slide presentation usually requires
more in dollars and in staff expertise. Written reports have more to recommend
them than lower development costs. They are also more easily referenced; two
months later, it is easier to refer back to page 27 than to return to the third
segment in a film presentation. In addition, the total cost of reporting will
involve a trade-off between the cost of each copy and the number of individuals
or agencies who can have their own copy of the report. When enormous numbers
are required, or when each report is individualized (e.g., different information
for each district), then audio-visual presentations are less feasible.
Nevertheless, for one-time audiences who need an overview of the state results, a media
presentation may be the most effective.

4. Personal Contact Enhances Reporting.

The examples of different media offered above are all potentially
long-distance transmitters of assessment results. But an important rule for
successful reporting is that the reports should be delivered personally. If those who
are responsible for reporting results can convey them directly to the respective
audiences, whether state legislators or district superintendents, there is an
increased likelihood that the message will be received. Face-to-face contact
ensures that the reports will be looked at and provides the opportunity to answer
technical or interpretive questions that cannot be answered by a written document.

This may seem an outrageous proposal, especially from a Californian who
should know that such a practice is impossible. But even in California, where
an assessment staff of six faces more than 1,000 districts and 5,000 schools,
some personal contact is provided by means of workshops. Area workshops are held
where district personnel can receive the reports and suggestions about how to
read and interpret the reports. The district personnel then provide a direct
contact for school principals and teachers. In most states, state assessment
staff make verbal reports to the legislature, but in many states this is the only
face-to-face reporting. Pennsylvania's example, where state assessment staff
visit every district involved in the assessment, is rare. Some additional states
hold public meetings throughout their states to provide a forum for the
dissemination of assessment results. Such meetings or the kinds of workshops
described in California should be considered as a minimal response to the
requirement for personal contact in reporting results.

5. Reports Should be Journalistic Rather Than Scholarly.

Reports should be less like dissertations and more like newspapers. This
admonition was prompted by the dreariness of many written reports, but it has
implications as well for the organization of information in slide shows and
verbal presentations.

Dissertations are typically organized following a laborious logic; although
based on the reasoning of the scientific method, such organization is useful only
if one reads the entire document. Similarly, state assessment reports usually
begin at the beginning with the purposes of the assessment, goal development, the
hierarchy of performance objectives, etc., leaving the important information, the
results, buried in the middle between the introduction and appendices.

The recommendation for a more journalistic style is based on the assumption
that most readers will not read the entire report even if, guided by principle
number two, its length and content have been fashioned especially for them.
Following the old who, what, when, where, and how paradigm, information should
be organized so that the most salient results are presented first. Each
succeeding paragraph that is added to the narrative should summarize the most
important information from what remains unsaid. This ordering of information from
most to least newsworthy will ensure the greatest information pay-off for each
reader regardless of where he stops reading. Of course, when the assessment
results are extensive, with separate categorization by region, district, or
background variables, each reader will have a different "most important part."
But there are still some general rules to follow. At least make the results easy
to find and separate from the purposes and procedures of the assessment. At
least give the "big group" results or statewide findings before reporting for
subgroups or by background variables.

Newspaper reporters should also be mimicked when selecting those aspects of
the assessment results which are most important to respective audiences. In
general, differences are more interesting than similarities. If the results this
year are the same as last year's in all subject areas except math, then the
headline information is whether the math scores are up or down.

6. Reports Should be Shorter.

Principle number six follows directly from principle number five. In
addition to saying the most important information first, sometimes reports should
then stop immediately. The length of a report is one of the variables that can
be most successfully manipulated to alter reports for their respective audiences.
National Assessment uses "Executive Summaries" and some states are beginning to
create separate abbreviated documents especially for the legislative audience.
This practice should be expanded. If assessment results are going to attract
the attention of legislators, taxpayers, parents, and newsmen, they will have to
be brief. Why not develop a single page summary entitled "Major Findings of the
1975 Statewide Assessment"? Such a page should be available separately as well
as being the first page of thicker documents. Those who are intrigued by the
information in the single page can certainly, and are more likely to, read
further. Save the elaborate breakdowns by subject subcategories, such as
prefixes and suffixes or consonant blends and digraphs, for the subject matter
experts.

7. Use Data Displays Wisely and They Will Carry the Report.

Graphs and tables are frequently more effective summaries of results than
pages and pages of narrative. The first rule for using them wisely is that too
many will spoil their effectiveness. Reams of tables should be relegated to the
appendix, leaving only selective examples in the text of the report. This rule
applies as well to slide shows and film presentations.

Tables should be well labeled so that they stand on their own. A second
and seemingly contradictory recommendation is that the use of tables should be
coordinated with the text so as to provide the reader with examples of how to
interpret the particular intersections of rows and columns. The possible
contradiction is based on the assumption that the author of the report cannot know how
each reader will extract information from the report. Some readers attend to
only the tables and graphs; others read the words and skip the figures.

Reports are frequently computer generated, especially when results are
reported for individual districts and schools. In these instances, pre-printed
forms or computer written sentences should be used to clarify the meaning of
numbers in the giant tables that result. In California, one of the best received
forms was the report to districts and schools of the 1974 second and third grade
reading assessment results. The two page computer generated form had four major
sections: total reading test results in comparison to national norms, total test
results in comparison to predicted scores based on multiple regression, a summary
of background data, and results by subcontent areas in reading (e.g., phonetic
analysis, consonants and vowels, under word identification skills). The first
section is reproduced here:

[Table: Reading Test Results. Column headings: Mean Test Score, Grade Level,
National, State, School.]

The computer written sentences are redundant with what is said in the table and
in the general description. But it was the individualized sentences that were
the most popular aspects of the report. They made it possible for those who were
unfamiliar with the technical terms to verbalize the assessment results.
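
A computer written sentence of this kind amounts to a template filled from
one row of the table. The sketch below is only an illustration: the wording
of the sentence, the function names, and the scores are all assumptions, not
the text of the California form.

```python
def compare(score, reference):
    """Return a plain-language word for how score relates to reference."""
    if score > reference:
        return "above"
    if score < reference:
        return "below"
    return "equal to"


def school_sentence(grade, school_mean, state_mean):
    """Restate one table row as a sentence (hypothetical wording)."""
    relation = compare(school_mean, state_mean)
    return (
        f"In grade {grade} your school's mean score of {school_mean:.1f} "
        f"was {relation} the state mean of {state_mean:.1f}."
    )


if __name__ == "__main__":
    # Invented figures, used only to show the output format.
    print(school_sentence(2, 62.4, 58.1))
    print(school_sentence(3, 70.0, 74.5))
```

The sentence adds nothing that the table does not already contain, which is
the point: it gives the reader words for the numbers.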

Graphs are often better than tables because they make similarities and
differences more visible. Graphs may be the only "analysis" that is required for
some audiences. The best rules for using graphs wisely are the old rules. For
example, quantitative scales should include zero so as not to exaggerate small
differences. A good reference is Chapter Three in Statistical Methods in
Education and Psychology by Glass and Stanley (1970). In addition, the success
of graphs will be enhanced by good labelling and by one-liners that repeat the
message conveyed by the graph. In the California example used above, the
following sentence appeared under a graph where the individual district score was
displayed in relation to the distribution of district scores statewide: "When
district averages are ranked statewide, the middle score (median) is 67.37 for
grade 2 and 81.85 for grade 3. These scores can be thought of as the performance
of an 'average district' in the state." Further information was then provided
regarding the relationship of a district's score to the percentile scale.

8. Save Technical Explanations for Footnotes or Technical Supplements.

Statistical analyses are pursued presumably to increase the meaningfulness
of raw score results. They should never obscure the information. Unfortunately,
when a statistical tool is used, it sometimes becomes so dominant in the
narrative that it overshadows the assessment results. For example, if "the standard
error of the difference" is repeated a dozen times in a single paragraph, the
reader is likely to worry more about his understanding of this statistic than
about the implications of the assessment results. It is preferable to translate
statistical principles into general rules of thumb for the reader so that the
statistical qualifiers will not have to be repeated at every turn. Here is
another example from the California report intended for district and school
personnel, speaking of subscores from the reading test:

    The percentile ranks are reported in the fourth column as bands
    or ranges rather than single points in order to show the error
    associated with test scores. The bands are shown on the graph
    to discourage over-interpretation of small differences in
    subtest scores. The more important differences are likely to be
    those where the bands do not overlap.
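
The rule of thumb in this passage (attend to differences only where bands do
not overlap) can be stated as a short check. This is a sketch under stated
assumptions: the subtest names and band limits below are invented, and the
report's own method for computing the bands is not given in this paper.

```python
from typing import NamedTuple


class Band(NamedTuple):
    """A percentile band: lowest and highest percentile rank, inclusive."""
    low: int
    high: int


def bands_overlap(a, b):
    """True when the two bands share at least one percentile rank."""
    return a.low <= b.high and b.low <= a.high


def notable_differences(subtests):
    """Return the pairs of subtests whose bands do not overlap."""
    names = sorted(subtests)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if not bands_overlap(subtests[first], subtests[second]):
                pairs.append((first, second))
    return pairs


if __name__ == "__main__":
    # Hypothetical subtest bands for one school.
    school = {
        "word identification": Band(35, 55),
        "vocabulary": Band(50, 68),
        "comprehension": Band(60, 80),
    }
    print(notable_differences(school))
```

With these invented bands only word identification and comprehension differ
by more than the bands allow; the other two comparisons would be left alone.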

In the same report, multiple regression was used to compute expected scores for
each school or district for comparative purposes. In the brief narrative on the
computer form, the term multiple regression was not used; instead, an effort was
made to present the same concept in lay terms:

    The numbers in the second column denote the range of scores that
    are most likely to be obtained by districts or schools like
    yours. This "Comparison Score Range" was computed using the
    background factors below.

    IN GRADE 2 MOST SCHOOLS LIKE YOURS SCORED IN THE RANGE
    FROM THE 41ST TO THE 69TH PERCENTILE.

Technical supplements were provided for those who were interested in how each of
the background variables was operationalized and in the beta weights used in the
regression equation.

9. Overcome Statistical Conservatism.

Even more bothersome than the technical vocabulary of the statistician is
the conservative training from inferential statistics which says, "we can never
know anything with certainty." On occasion, this conservatism has the effect of
making all assessment results seem equivocal. This leaves non-technical audiences
with the feeling that none of the results can be trusted and wondering what good
the assessment was anyway.

Of course, there are errors due to sampling and imperfections in the
measurement device, and less estimable errors such as those due to differences
in test administration or school by test-content interactions. But presumably,
these are not so enormous as to entirely invalidate the assessment results. If
this were the case, the assessment program should have been called off short of
printing the results. The appropriate statistical tools should be used to
estimate the magnitude of errors associated with particular scores or differences in
scores. These should be used to preclude over-interpretation of small
differences. In addition, some initial disclaimer may be called for concerning the
appropriateness of the test content for making some decisions but not others.
But then, the disclaimers and equivocations should cease. The report should make
the best statement possible using the decision rules from inferential statistics,
and then let the statement stand. If sixth grade math scores have gone up more
than could be accounted for by sampling fluctuation or differences in test
administration, then the report writer should say that there has been a
trustworthy change in the level of pupil performance.
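
One way to apply a decision rule once and then let the statement stand is
sketched below. It assumes two independent samples, a normal approximation,
and the conventional 1.96 cutoff; it covers sampling fluctuation only, not
differences in test administration. The function names and all scores are
hypothetical.

```python
import math


def se_of_difference(se_a, se_b):
    """Standard error of the difference between two independent means."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def describe_change(mean_then, se_then, mean_now, se_now, critical=1.96):
    """Turn two means and their standard errors into one plain statement.

    critical=1.96 is the usual two-sided 5 percent cutoff, an assumption
    rather than a value taken from this paper.
    """
    diff = mean_now - mean_then
    z = diff / se_of_difference(se_then, se_now)
    if abs(z) < critical:
        return "no trustworthy change"
    return "scores went up" if diff > 0 else "scores went down"


if __name__ == "__main__":
    # Invented sixth grade math means and standard errors.
    print(describe_change(61.0, 0.8, 64.0, 0.9))
    print(describe_change(61.0, 0.8, 62.0, 0.9))
```

The statistic stays behind the scenes; the reader sees only the resulting
statement, with the standard errors left to a technical supplement.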

10. Don't Obscure the Information.

In a few noteworthy instances, assessment results have been meaningless to
their audiences not because of statistical conservatism or lack of journalistic
skill, but because it has been the intention of the report author or the state
agency to obscure the information. Repeated examples of misinterpretation of
assessment results by newspapers and legislators have made state assessment
personnel protective of themselves and local educators. It is obviously very
difficult to get an undistorted message delivered to the public sector. But it
is unforgivable to react by shrouding the assessment results in statistical or
educational jargon. The most common ploy is to produce such a surfeit of data
that neither the press nor anyone else can make any sense out of it. Another
example occurred at a recent workshop where representatives from one state
bragged about their practice of reporting district averages in relation to the
state average for pupils rather than in comparison to the mean or median of
district scores. Because large districts in their state tend to be low scoring
districts, the result was that more than half of the districts had results which
could be called "above average." This they believed was the best of all possible
worlds. These comments are not intended to dispute the comparative strategy used;
each reference statistic has a different meaning, and selection should be based on
the purpose of the comparison. What was clearly wrong, in the opinion of this
author, was the boast that significant audiences were fooled rather than
enlightened by the assessment results. How can such practices be sanctioned by the
same individuals who advocate assessment because it will provide useful
information?
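
The arithmetic behind that workshop example can be made concrete. The
districts below are invented: one large, low scoring district and four small
ones. The sketch compares the state average for pupils (weighted by
enrollment) with the median of district averages as the reference point.

```python
from statistics import median

# Hypothetical districts: (enrollment, district mean score).
DISTRICTS = [
    (40000, 55.0),  # one large, low scoring district
    (1200, 66.0),
    (900, 68.0),
    (800, 70.0),
    (600, 72.0),
]


def pupil_weighted_mean(districts):
    """State average for pupils: each district counts by its enrollment."""
    total = sum(n for n, _ in districts)
    return sum(n * score for n, score in districts) / total


def median_of_district_means(districts):
    """Middle district average, with every district counted once."""
    return median(score for _, score in districts)


def share_above(districts, reference):
    """Fraction of districts whose mean exceeds the reference value."""
    above = sum(1 for _, score in districts if score > reference)
    return above / len(districts)


if __name__ == "__main__":
    pupil_avg = pupil_weighted_mean(DISTRICTS)
    district_median = median_of_district_means(DISTRICTS)
    print(round(pupil_avg, 2), share_above(DISTRICTS, pupil_avg))
    print(district_median, share_above(DISTRICTS, district_median))
```

With these figures four of five districts exceed the state average for
pupils, while fewer than half exceed the median district. Either reference
can be legitimate; the report should say which one is being used.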




11. Make Comparative and Interpretive Information a Part of the Report.

Comparisons are essential if respective audiences are to derive meaning from
assessment results. Whether in relation to expected performance based on
professional judgment, or in relation to a normative standard, comparative
information should be as much a part of the assessment report as the raw data.

Interpretive information about the implications of the assessment results
is necessary to ensure that the information will be used. In his monograph,
Frank Womer urged that interpretation be built into the total plan.
Interpretation, as to evident strengths and weaknesses and possible courses of action, can
only be made by subject matter experts; but it is the responsibility of the
assessment staff to collect and report the interpretive information as well as
the pupil performance data. National Assessment arranges for subject matter
experts to react to the assessment results and to recommend public policy changes
that may be warranted at the national level or curricular changes that might be
desirable at the local level. Oregon and Maine are two of the few states where
interpretive information has been provided as part of the reporting plan.
Committees of teachers and reading specialists outlined what results could mean and
what possible responses would be appropriate from state and local agencies.

12. Field Test Reports.

This final recommendation does not occur among the old rules for successful
reporting. Field testing is the best means for implementing the advice offered
in the foregoing principles. There is no way of anticipating the exact content,
vocabulary, or organization that will be best for a particular audience. Reports
should be tried out with their intended users in the same way that assessment
instruments are field tested. Initially, a small number of respondents is
required. Once major flaws in wording or formatting have been eliminated, more
systematic sampling should be done to learn from users what type of report will
be most effective. Field testing should be done long before actual assessment
results are available, either by using old test scores to approximate the new
assessment data or by simulating results. In this way, an improved reporting
format will be available by the time that the results are analyzed.

Follow-up studies should be used to facilitate the continued improvement of
assessment reports. Interviews with a selected sample from an intended audience
will provide feedback as to the actual meaning inferred from the narrative and
data displays in a report. More extensive sampling can also be done by
questionnaire to learn of the actual uses made of assessment results. Unintended uses
may be worth addressing directly in subsequent assessments. In both field
testing and follow-ups, the principles enumerated in this paper should provide
categories for asking audiences to evaluate the utility of a report.

Conclusion

The ultimate success of state assessment programs will depend on how well
assessment results are reported to their various audiences. In this paper, the
most compelling recommendations for improving reporting practices are principles
one, two, and twelve: plan ahead, develop different reports for different
audiences, and field test report formats to determine the language and content
that are most meaningful to respective audiences. Reporting should receive the
same careful attention as instrument construction with sufficient opportunity
for feedback from intended users.